1      Introduction

Distributional  analysis  has  been  a  widely  used  technique  in  the  study  of 
social  choice  in  Euclidean  models  [28,29,1,3,8,15,5,2,23]  (see  also  [4]  and 
[19,  Chaps.  11-12])  for  more  than  two  decades.  In  distributional  analysis,  a 
continuum  or  infinite  population  of  voters  is  analyzed,  where  the  population 
follows some probability distribution μ.

Infinite  populations  do  not  exist.  Therefore,  the  principal  purpose  of 
distributional  analysis  must  be  to  give  insight  into  the  behavior  of  large 
but  finite  populations. 

In  this  paper  it  is  shown  that  distributional  analysis  is  flawed  when 
applied  to  this  end.  The  problem  is  essentially  one  of  convergence:  if  the 
limiting  case  is  to  give  insight  into  the  large  finite  case,  behavior  of  the  latter 
should converge to behavior of the former as the population grows.
Unfortunately, it turns out that properties of finite populations do not in
general converge to the properties of infinite populations. In some cases a
distributional analysis will predict that a point is in the core with
probability 1, while the true probability converges to 0. Thus analysis of
infinite populations may fail to yield any information about finite
populations, however large.

An  alternative  technique 

An  alternative  probabilistic  technique  for  the  study  of  social  choice  is 
termed here the finite sample method. In this method, n points are
independently sampled from the distribution μ. This random finite sample
from μ forms a configuration of n points whose properties are analyzed.
A  typical  question  would  be:  "what  is  the  probability,  as  a  function  of  n, 
that  the  configuration  generated  has  nonempty  core?"  Typical  answers  to 
these  questions  are  bounds  or  asymptotically  close  estimates  for  the  desired 
probability. 

It  is  sometimes  possible  to  combine  distributional  analysis  with  finite 
sample  analysis  to  make  correct  predictions  about  the  asymptotic  behavior 
of large populations. An example of this is found in [2]. We expose some
key properties which enable the convergence in this case, yielding a simpler
and more general proof of the convergence of min-max majority rule. We
also estimate the population size for which the results are meaningful, i.e.,
at which convergence begins to take hold. For committee sizes of 10,000
or  more,  a  2/3  majority  rule  is  likely  to  be  stable,  under  the  concavity 
assumptions  of  [2].  For  committee  sizes  of  250  or  less,  there  is  some  doubt 
as  to  whether  2/3  majority  rule  is  necessarily  stable. 

Following  a  suggestion  due  to  Robert  Foley,  Richard  McKelvey,  and 
Gideon Weiss, we explore the use of uniform convergence theorems to
transform distributional results into finite sample results. Theorems about
the uniform convergence of empirical measures [18, e.g.] yield a simpler and
more general proof of Simpson-Kramer min-max convergence [2] and a
simpler though less general proof of yolk shrinkage [26]. The analysis suggests
a  rule  of  thumb  as  to  when  one  might  expect  distributional  analysis  to  give 
accurate  or  inaccurate  predictions  about  the  behavior  of  finite  populations. 

A  careful  reading  of  Tullock's  original  paper  [28]  reveals  a  clear  insightful 
distinction  between  the  distributional  and  finite  sample  methods,  and  a 
remarkable  foreshadowing  of  some  of  the  outcomes  of  finite  sample  analysis. 

Empirical  study  of  social  choice 

Another  motivation  for  analyzing  the  distributional  method,  besides  the 
clarification of results in the literature, is to help uncover a rigorous
foundation for statistical empirical study of group choice. One would like to
poll the members of a committee, assembly, or population (or in some other way
extract data on their positions on the issues), and based on that data and
some solution concept, make a prediction with some confidence regarding
what the outcome will be. How do we experimentally test a solution
concept? Ignoring the difficulties of data acquisition (e.g. sincerity), and any
computational  issues,  there  is  still  a  problem  regarding  the  stability  of  the 
solution  concept  with  respect  to  individual  perturbations.  In  other  words, 
a  person's  views  on  issues  are  not  perfectly  constant;  one  can  even  change 
one's  mind  in  the  voting  booth.  How  can  we  know  that  a  prediction  based 
on  polls  taken  one  day  is  apt  to  be  close  to  the  actual  results  the  next  day? 

We may think of each person's views as having a probability
distribution. When we interview a person we get a random sample from this
distribution.  When  that  person  votes  or  negotiates  in  committee,  it  is  on 
the  basis  of  another  random  sample  from  this  distribution.  The  problem 
is  to  establish  rigorously  the  stability  of  a  solution  concept  under  these 
conditions. 

In statistical terms, the finite sample from μ yields an empirical measure
μ_n. A solution concept is a statistic, a function f operating on probability
measures. If f is a consistent statistic, then the limiting behavior of f(μ_n)
will (almost surely) be like f(μ) and the solution concept is stable.

This  issue  has  received  a  great  deal  of  attention  for  the  classical  core 
or  Nash  equilibrium  under  the  term  "structural  stability".  The  outcome 
is  negative:  the  Nash  solution  concept  is  not  usually  applicable,  and  is 
never  structurally  stable  in  three  or  more  dimensions  [20].  In  section  6  we 
illustrate  how  theorems  for  the  uniform  convergence  of  empirical  measures 
[18,  e.g.]  can  be  invoked  to  establish  the  stability  of  other  more  widely 
applicable  concepts. 

The  outline  of  the  paper  follows:  the  remainder  of  this  section  reviews 
essential  definitions  of  the  spatial  model.  Section  2  introduces  the  two 
methods  by  way  of  a  small  example.  Section  3  analyzes  the  distributional 
method. Section 4 demonstrates in greater detail a case from [1] where the
distributional method gives a misleading result. Section 5 discusses a case
(the 64%-rule of Caplin and Nalebuff [2]) where the finite sample method
may be combined with the distributional method to achieve results valid
for large finite populations. Large is argued to be somewhere between 250
and 10,000 in this case. Section 6 introduces the use of uniform
convergence of empirical measures and discusses in general when one may expect
the  distributional  method  to  be  useful  and  when  we  may  expect  it  to  be 
misleading.  Section  7  concludes  by  re-examining  Tullock's  original  paper 
[28]. 

1.1      Definition  of  the  spatial  model 

In  the  Euclidean  spatial  model,  a  social  choice  involving  m  issues  is  to  be 
made. The possible proposals are represented as vectors in ℝ^m. Each
individual i has a most preferred point x_i ∈ ℝ^m. This point will be referred
to as a voter point, or simply a voter. Under Euclidean preferences, an
individual faced with two alternatives will select the one closest to her most
preferred point, under the Euclidean norm. This model is more general
than it appears: Davis et al. [3] show it is equivalent to any linearly
transformed spatial model which maintains the properties of an inner product;
Grandmont [8] (see also [2, section 5]) observes that the essential property
of the Euclidean model is often the "division-by-hyperplane" property (in
the Euclidean case, the perpendicular bisector of two points separates those
who prefer one point to the other), and so results in the Euclidean model
usually apply to the more general class of "intermediate preferences",
including constant elasticity of substitution (C.E.S.) utility functions (these
extend the class of Davis et al. by allowing a change to an L_p norm from
the L_2 norm).

2      Two  methods  and  an  example 

Let  us  begin  with  a  simple  two-dimensional  model  based  on  an  example  in 
[23]. Let μ be a probability distribution that is uniform on a circle (the
circumference of a disk). Place a single voter v_1 at the center of the circle,
which for convenience we locate at the origin. Randomly generate n − 1
additional voter points v_2, ..., v_n, where n is even, according to μ.

We  introduce  some  terminology.  A  particular  realization  of  this  random 
process is a configuration, a specific set of points V = {v_1, ..., v_n} with
specific locations in ℝ^m. In this case V is a finite configuration. If |V| is
infinite, V is an infinite configuration.

The finite sample method

Next  we  illustrate  the  method  of  finite  sample  analysis  on  the  model 
just stated. The question we pose is: what is the probability that v_1 is
undominated in the configuration V? A result of Schofield's [23] implies
that the probability is positive, but the exact probability was not known
until recently: v_1 is undominated with probability 1/2^{n-2}. Notice that the
answer to the question is parameterized by n, as one would expect. The
proof is sketched here since it will be needed in Sections 4 and 5.

Theorem 1. Place v_1 at the origin and generate v_2, ..., v_n independently
according to any nondegenerate sign-invariant distribution μ. Then for all
even n, the probability v_1 is undominated is 1/2^{n-2}.

Proof: ([25]) Associate with each θ, 0 ≤ θ < π, a line passing through the
origin and an associated orientation. See Figure 1. Denote this line by
L(θ). The open half space the line is oriented towards is the "front" and
the other open half space is the "back" of the line L(θ).

Since the points are drawn from a nondegenerate distribution, the
probability is 0 that any pair of the points v_2, ..., v_n are collinear with
the origin. Henceforth we assume this event does not occur.


For any 0 ≤ θ < π define the gap function g(θ) to equal the number
of voter points in the front half plane of L(θ) minus the number of voter
points in the back of L(θ). If g(θ) = −1 or 1, the line L(θ) divides the n − 1
points v_2, ..., v_n as equally as possible given that n is even. If however
the gap function g(θ) ever attains |g(θ)| ≥ 3 then one side of the line will
contain at least 1 + n/2 points and v_1 will not be a core point.

Starting at θ = 0, increase θ continuously to π. Because no two points
are collinear with v_1, g(θ) will change by either +2 or −2 as the line L(θ)
crosses over a point v_i. Let θ_1, ..., θ_{n-1} denote the values of θ at which
L(θ) crosses over a voter and let X_i = +2 or −2 accordingly as the ith
crossover increases or decreases g(θ). The key observation is that the gap
function executes a random walk as θ goes from 0 to π.

Lemma 1. X_1, ..., X_{n-1} are independent identically distributed variables
taking values 2 with probability 1/2 and −2 with probability 1/2.
Proof of Lemma: the proof follows easily from the sign-invariance of μ. See
Figure 2: the regions I and II are equally likely to contain the next point
as we sweep L(θ) around. Details are given in [25].

There are 2^{n-1} possible paths for the random walk of the X_i to take.
Of these, only two will keep the gap function at |g(θ)| ≤ 1. These are the
alternating paths X_i = 2(−1)^i and X_i = −2(−1)^i. Any nonalternating
sequence must contain two consecutive +2's or two consecutive
−2's. If this ever happens, the gap function will change by 4 and so must leave
the range {−1, 1}. By Lemma 1, each of the 2^{n-1} possible paths occurs with
equal probability 1/2^{n-1}. Therefore the probability that max_θ |g(θ)| ≤ 1 is
2/2^{n-1} = 1/2^{n-2} as desired. This completes the proof of Theorem 1.
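
The counting step of the proof can be checked by brute force for small n. The following Python sketch is illustrative only: the function name count_bounded_walks is hypothetical, and it assumes that the starting value g(0) can be recovered from the steps through the antisymmetry g(π) = −g(0), which gives g(0) = −(X_1 + ... + X_{n-1})/2. It enumerates all 2^{n-1} step sequences and counts those whose gap walk never leaves {−1, 1}.

```python
from fractions import Fraction
from itertools import product


def count_bounded_walks(n):
    """Count step sequences (X_1..X_{n-1}) whose gap walk keeps |g| <= 1."""
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be even and at least 2")
    good = 0
    for steps in product((2, -2), repeat=n - 1):
        # Assumption: L(pi) is L(0) with front and back swapped, so
        # g(pi) = -g(0) and hence g(0) = -(sum of steps) / 2.
        g = -sum(steps) // 2
        stays_bounded = abs(g) <= 1
        for x in steps:
            g += x
            if abs(g) > 1:
                stays_bounded = False
                break
        if stays_bounded:
            good += 1
    return good


def undominated_probability(n):
    """Exact probability obtained from the enumeration."""
    return Fraction(count_bounded_walks(n), 2 ** (n - 1))


if __name__ == "__main__":
    for n in (2, 4, 6, 8, 10):
        print(n, count_bounded_walks(n), undominated_probability(n))
```

For each even n tried, exactly two sequences (the alternating ones) survive, in agreement with the value 1/2^{n-2}.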

A few remarks about Theorem 1: the proof only assumes μ is
sign-invariant: μ(y) = μ(−y), so it applies to the uniform rectangle
distribution in [29,1] and many others. For the model under discussion,
Theorem 1 gives a stronger outcome, for obviously (w.p. 1) no other point in
ℝ^2 can be undominated. Thus the configuration V has nonempty core with exact
probability 1/2^{n-2}.
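
The value 1/2^{n-2} can also be checked by direct simulation of the circle model. The sketch below is a hedged illustration, not part of the original analysis: the function names are hypothetical, and it assumes the criterion from the proof of Theorem 1, namely that the origin is undominated exactly when no open half plane through the origin contains more than n/2 of the n − 1 sampled points.

```python
import math
import random


def origin_is_undominated(angles, n):
    """angles: polar angles of v_2..v_n on the circle; v_1 sits at the origin."""
    two_pi = 2.0 * math.pi
    for a in angles:
        # Count points in the half-open arc [a, a + pi). The largest count
        # over open half planes is attained with the boundary just before
        # one of the sampled points.
        in_arc = sum(1 for b in angles if (b - a) % two_pi < math.pi)
        if in_arc > n // 2:
            return False
    return True


def estimate_core_probability(n, trials, seed=0):
    """Monte Carlo estimate of P(v_1 undominated) for even n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n - 1)]
        if origin_is_undominated(angles, n):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (4, 6, 8):
        est = estimate_core_probability(n, trials=20000, seed=n)
        print(n, est, 1.0 / 2 ** (n - 2))
```

The estimates are random, so only approximate agreement with 1/2^{n-2} should be expected.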

The distributional method

Now  let  us  illustrate  the  distributional  method  on  the  same  model.  (The 
following  closely  follows  analyses  in  [29,23,3,8]).  Assume  a  continuum  of 
voters  uniformly  distributed  on  the  circle.  Every  line  passing  through  0  has 
mass ≤ 1/2 on either side of it. That is, each halfspace h defined by a line
through 0 has μ(h) ≤ 1/2. Thus v_1 is undominated. In fact by [3, theorem
1] or [15, theorem 2] it is the unique "dominant" or undominated point.
(The  reader  who  is  concerned  about  the  "extra"  point  at  0  may  observe 
that  this  only  improves  the  position  of  0  with  respect  to  equilibrium.) 

The  contrast  between  the  two  methods  is  evident.  The  finite  sample 
method  shows  that  the  probability  of  0  being  undominated,  indeed  of  a 
nonempty  core,  rapidly  converges  to  0.  The  distributional  method  says 
that  for  an  infinite  population,  the  probability  of  0  being  undominated  is 
1. 

The example of this section reveals that there is a flaw in the
distributional method. It would be desirable for the outcome of the distributional
method  to  coincide  with  the  limiting  behavior  of  finite  samples,  since  the 
goal  must  be  insight  into  the  behavior  of  finite  populations.  Yet  there 
could  hardly  be  less  consonance  than  in  the  example  just  given.  In  the 
next  section  we  analyze  the  distributional  method  to  explain  how  this  flaw 
arises. 

3      An  analysis  of  the  distributional  method 

A  contrast  with  distributional  analysis 

We  have  observed  that  the  outcomes  of  the  two  methods  can  differ. 
Let us point up an important distinction in how they operate. The
distributional method works directly with μ, and quantities such as μ(h) are
considered. On the other hand, in the finite sample method a
configuration V is drawn from μ, and quantities such as |V ∩ h| are considered.
Informally, the distributional method counts up voters by looking at the
distribution function μ directly, while the finite sample method counts up
voters by looking at configurations drawn from it.

More formally, the distribution function μ analyzed in finite sample
analysis is not an infinite configuration; rather, it is a probability measure
defined on the appropriate Ω, which for fixed n may be thought of as the set
of all possible configurations of cardinality n. In contrast the distributional
method treats μ as an infinite configuration.
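
The difference between μ(h) and |V ∩ h| can be made concrete with a small simulation. The following sketch is illustrative and rests on assumptions not in the text: μ is taken to be uniform on the unit circle, h is the open half plane {x > 0} (so μ(h) = 1/2), and the function name halfplane_summary is hypothetical. It reports how often the sample count |V ∩ h| exceeds n/2 and the mean absolute deviation of the count from n/2, scaled by √n.

```python
import math
import random


def halfplane_summary(n, trials, seed=0):
    """Return (share of draws with |V ∩ h| > n/2, mean |count - n/2| / sqrt(n))."""
    rng = random.Random(seed)
    exceed = 0
    total_abs_dev = 0.0
    for _ in range(trials):
        # Draw n points uniformly on the circle and count those with x > 0.
        count = sum(
            1
            for _j in range(n)
            if math.cos(rng.uniform(0.0, 2.0 * math.pi)) > 0.0
        )
        if count > n / 2:
            exceed += 1
        total_abs_dev += abs(count - n / 2)
    return exceed / trials, total_abs_dev / trials / math.sqrt(n)


if __name__ == "__main__":
    share, scaled_dev = halfplane_summary(n=201, trials=4000, seed=0)
    print("share of draws with more than half the sample in h:", share)
    print("mean |count - n/2| / sqrt(n):", scaled_dev)
```

Although μ(h) never exceeds 1/2, a sampled configuration places more than half of its points in h in roughly half of the draws, and the excess is of order √n.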

A  brief  history  of  distributional  analysis 


The term distribution is used in the economics literature
to mean both "configuration" and "distribution function" as defined
here. If we examine the literature of distributional analyses, we find that
it is intertwined with analyses giving necessary and/or sufficient conditions
for domination, local equilibrium, and/or global equilibrium in finite
configurations (to use terminology defined here) [17,15,3,1,23,16]. For instance,
Plott's  classic  paper  [17]  is  titled 

A notion of equilibrium and its possibility [emphasis added]
under majority rule.

Plott  performs  no  probabilistic  analysis  but  observes  (quite  rightly)  [IBID, 
page  792]  that 

it  would  only  be  an  accident  (and  a  highly  improbable  one)  if 
an  equilibrium  exists  at  all. 

Tullock's  analysis  [28]  is,  as  Davis  et  al.  [3,  page  148]  observe,  "informally 
developed  without  theorems  or  proofs  by  the  device  of  insightful  examples." 
Later  papers  such  as  [15,3,16]  meld  these  analyses  by  formalizing  ideas  of 
Tullock  [28,29]  and  simultaneously  generalizing  Plott's  results  to  infinite 
populations  and/or  more  general  preference  functions  (also  global  rather 
than  local  equilibrium).  For  instance,  Davis  et  al. [3,  page  148]  contrast 
their  work  with  Plott's  since  the  latter 

allows  only  a  finite  number  of  individuals  to  be  considered. 

Presumably Davis et al. view this "limitation" of Plott's analysis as
undesirable because more insight is needed as to the behavior of large finite
populations.

In 1981, however, Tullock remarks [30, page 190] that his analysis was
"not regarded as very reliable any more because McKelvey proved that
majority voting can reach any part of the issue space." The analysis Tullock
refers to ultimately showed (see [22,23,20, e.g.]) that the set of
configurations for which equilibrium exists has measure 0, for d ≥ 3 and also
for d = 2 and odd n, confirming what Plott had said all along. These
powerful results seem implicitly to invalidate the distributional analyses. Yet
this consequence does not even now appear to be fully assimilated in the
literature. The only unresolved case was d = 2, n even (and that was my
original motive for undertaking this line of research).

Where  is  the  flaw  in  distributional  analysis? 

We have seen that some of the distributional analyses suggested
implications at odds with the instability theorems of McKelvey, Schofield,
Rubinstein, and others [12,13,21,20]. So is there a flaw in the distributional
arguments, and if so what is it? The crucial part is lucidly exposed by
Arrow in his 1969 paper [1]. Summarizing Tullock's analysis, Arrow writes
[page 108]:

He  [Tullock]  assumes 

(1) that the number of voters is large, so large that we may
consider  them  to  constitute  a  continuum. 

This  assumption  seems  innocuous  enough.  In  the  mathematics  literature, 
passing  to  the  limiting  continuous  case  is  a  popular  technique.  The  problem 
is that majority rule requires us to evaluate n/2 where n = the number
of voters, but the value ∞/2 is not well-defined. More precisely, if 0 is
undominated and only one voter is located at 0 then placing two additional
voters together at any location x ≠ 0 must make 0 dominated (by the point
εx for sufficiently small ε > 0). But if n is treated as infinite no shifting of
any finite number of voters changes the analysis, since ∞/2 + 1 = ∞/2.

What  happens  is  that  a  new  definition  is  needed  when  passing  from  the 
finite  to  the  infinite  case.  Let  us  examine  a  specific  definition  from  the 
literature.  In  an  article  by  Davis,  DeGroot,  and  Hinich  [3],  necessary  and 
sufficient  conditions  are  derived  for  the  existence  of  a  dominant  point.  As 
stated  earlier,  this  analysis,  unlike  Plott's,  is  intended  to  apply  to  infinite 
populations.  The  critical  definition  of  a  non-dominance  relation  R  is  quoted 
below  [3,  page  149]. 

Let  P*  denote  the  distribution  of  most  preferred  points  of  the 
individuals.  Let  X  be  the  most  preferred  point  of  an  individual 
chosen at random from the population. [Note: P* is referred to
as an infinite configuration in the previous sentence and as a
probability function in the next sentence.] Given a (Borel) set
S ⊂ E^n, Pr(S) will denote the probability that X ∈ S under
the distribution P*.

Definition 1: For any points y ∈ E^n and z ∈ E^n, it is said
that yRz if Pr(||y − X|| ≤ ||z − X||) ≥ 1/2.

The definition of the relation R just given is mathematically
unambiguous and therefore is mathematically correct. The mathematics in the paper
[3]  is  of  course  correct.  But  there  is  a  problem  with  the  interpretation  of 
the  mathematical  results.  In  [3]  the  passage  just  cited  continues  with  the 
following  interpretation: 

In  other  words,  yRz  if  and  only  if  at  least  half  the  population 
either  prefers  y  to  z  or  is  indifferent  between  y  and  z. 

What  does  the  word  "population"  mean  in  the  sentence  just  quoted?  If 
we take it to mean the probability measure, then it would be accurate to
say that

yRz if and only if the measure (mass) of the subset of the
population that either prefers y to z or is indifferent between y and
z is at least 1/2.

But if the word "population" refers to a finite sample drawn from the
distribution P*, then the meaning of yRz is given by the following theorem.

Theorem 2. Suppose a finite number of points are drawn at random
according to the distribution P*. Then

yRz  if  and  only  if  the  probability  is  at  least  1/2  that  at  least 
half  the  population  either  prefers  y  to  z  or  is  indifferent  between 
y  and  z. 

Proof: Suppose yRz. If we were to take a finite sample under the
distribution P*, each sample point would with probability at least 1/2 be at
least as close to y as to z. Then the number of points in the sample at
least as close to y as to z follows a binomial distribution with "success"
parameter p ≥ 1/2. From the most elementary properties of the binomial
distribution p ≥ 1/2 implies the probability is at least 1/2 that at least half
the outcomes are "successes". Conversely, if the probability is at least 1/2
that at least half of the Bernoulli trials end in success, it must be that the
parameter p ≥ 1/2, whence yRz.
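
The forward direction of the proof can be illustrated numerically. The sketch below is a hedged example with a hypothetical function name: it computes, in exact rational arithmetic, the probability that at least half of n independent Bernoulli(p) trials succeed, so that the values for p ≥ 1/2 can be compared with 1/2.

```python
from fractions import Fraction
from math import comb


def prob_at_least_half(n, p):
    """Exact P(X >= n/2) for X ~ Binomial(n, p), with p given as a Fraction."""
    p = Fraction(p)
    k_min = (n + 1) // 2  # smallest integer k with k >= n/2
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    for n in (5, 25, 101):
        for p in (Fraction(1, 2), Fraction(3, 5)):
            print(n, p, float(prob_at_least_half(n, p)))
```

For p = 1/2 and odd n the value is exactly 1/2, while for any fixed p > 1/2 it approaches 1 as n grows; this is the range between 1/2 and 1 within which the two readings of yRz can differ.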


3.1      The  heart  of  the  problem 

We  have  arrived  at  the  heart  of  the  problem.  When  going  from  finite 
to  infinite  populations,  a  new  definition  of  the  nondominance  relation  R 
was  needed.  Succinctly,  let  A  denote  "at  least  half  the  population  either 
prefers  y  to  z  or  is  indifferent  between  y  and  z."  Then  for  any  finite 
sample population, yRz means that A occurs with probability at least 1/2. But the
interpretation  for  infinite  populations  in  [3]  is,  yRz  means  that  A  occurs. 

If  the  purpose  of  the  mathematical  analysis  of  infinite  populations  in  [3] 
is  to  gain  insight  into  the  behavior  of  large  finite  populations,  then  there 
should  be  a  closer  correspondence  between  the  meanings  of  yRz  for  finite 
samples  and  for  infinite  populations. 

The  gap  between  the  finite  sample  (Theorem  2)  and  the  distributional 
(Definition  1)  methods  just  discussed  is  between  1/2  and  1.  In  the  earlier 
example  of  section  2  involving  Theorem  1,  the  gap  was  (asymptotically) 
between  0  and  1.  The  larger  gap  in  that  example  was  due  to  the  intersection 
of  many  events  each  with  probability  1/2. 

4      An unsuccessful case: The Sonnenschein-Arrow Theorem

Let  us  now  examine  a  specific  case  of  analysis  from  the  literature  where 
the  predictions  of  distributional  analysis  are  misleading.  In  his  article, 
Arrow  continues  by  stating  a  theorem  (he  attributes  to  Sonnenschein)  that 
generalizes Tullock's example [1, pages 108-109]:

For any pair of alternatives x, y, let N(x,y) be the number
of individuals who prefer x to y. Then let xMy be the
statement N(x,y) ≥ N(y,x) and xMy the statement that N(x,y) >
N(y,x)....

Theorem. Suppose that, for each alternative x°, the set of
alternatives x for which xMx° is closed, and [suppose] the set
of alternatives [x] for which xMx° is convex. Then for any
compact (closed and bounded) convex set of alternatives S, there
is (at least) one alternative x in S such that xMy for all y in
S.

Arrow later points out that "the hypotheses of the theorem are
obviously fulfilled in Tullock's example." [IBID, page 110]. This is of course
correct, but only subject to assumption (1) above. For if we employ the
finite sample method of this paper, we find the probability converges to 0 that
the hypotheses of the Sonnenschein-Arrow theorem are fulfilled in Tullock's
example. The following theorem states and proves this statement precisely.

Theorem 3. Under the hypotheses of Tullock's example or of Theorem 1,
the probability that the set {x : xM0} is convex converges to 0 as n → ∞.

Proof: It suffices to consider only the more generous assumption of
Theorem 1. Recall the gap function g(θ) used in the proof of Theorem
1. If (and only if) g(θ) > 1 then a strict majority of the points are in
the halfplane defined by the normal vector v_θ with orientation θ. Then for
sufficiently small ε > 0, the point y = εv_θ dominates 0, i.e. y is in the
set {x : xM0}. Rotate the vector v_θ through an open halfplane, i.e. let θ
range in the interval [0, π). If the gap function g(θ) ever exceeds 1, drops
to 1 (or below), and later exceeds 1 again, the set {x : xM0} will fail to
be convex (see Figure 3). This is because there will exist distinct values
0 ≤ θ_1 < θ_2 < θ_3 < π such that for all sufficiently small ε > 0, (ε, θ_1) and
(ε, θ_3) are in the set, but (ε, θ_2) is not in the set (using (r, θ) notation). If
the random walk executed by the gap function behaves in this fashion, then
the set is not convex, and we call the walk "bad".

By Lemma 1, the values of the gap function execute an unbiased random
walk centered around 0. Therefore we may select the orientation of θ = 0 so
that the walk has n − 2 steps and starts at 1. By the recurrence properties
of one dimensional symmetric random walks [6, e.g.], the probability that the
walk is bad converges to 1 as n → ∞. In fact it will be bad infinitely often,
so the set {x : xM0} will have many nonconvexities. This proves the theorem.
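
The growth of the probability of a "bad" walk can be explored by simulation. The sketch below is a rough illustration under stated assumptions rather than a rendering of the proof: the function names are hypothetical, the steps are drawn as in Lemma 1, the starting value is taken to be g(0) = −(X_1 + ... + X_{n-1})/2 (from the antisymmetry g(π) = −g(0)), and a walk is called bad when the values of g on [0, π) form at least two separate runs above 1.

```python
import random


def walk_is_bad(steps):
    """True if the gap walk exceeds 1, drops to 1 or below, and exceeds 1 again."""
    # Assumption: g(pi) = -g(0), hence g(0) = -(sum of steps) / 2.
    g = -sum(steps) // 2
    above = g > 1
    runs_above = 1 if above else 0
    for x in steps:
        g += x
        now_above = g > 1
        if now_above and not above:
            runs_above += 1  # a new excursion above level 1 begins
        above = now_above
    return runs_above >= 2


def bad_walk_probability(n, trials, seed=0):
    """Monte Carlo estimate of P(walk is bad) for n voters (n even)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        steps = [rng.choice((2, -2)) for _ in range(n - 1)]
        if walk_is_bad(steps):
            bad += 1
    return bad / trials


if __name__ == "__main__":
    for n in (4, 10, 50, 400):
        print(n, bad_walk_probability(n, trials=2000, seed=n))
```

Since a bad walk is only a sufficient condition for nonconvexity, the estimates are lower bounds on the probability that the set fails to be convex.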

It has previously been observed that the Sonnenschein-Arrow Theorem
can fail to be applicable. Greenberg [9], in a lovely paper on d-majority
equilibrium, gives a deterministic example with n = 4 voters in which
the set {x : xM0} is not convex. At the time it must have seemed that
examples such as Greenberg's would become less likely as n increased. For
instance Kramer [11, page 313] remarks,

Several authors, ... have argued that this instability is a
"small-sample" problem, and that majority equilibria will be more
likely when the number of voters is large; examples and results
supporting this thesis have been exhibited by

Theorem  3  demonstrates  that  Greenberg's  example  represents  the  rule, 
not  the  exception. 

5      A  successful  case:  The  64%-rule 

Although the distributional method can mislead, it sometimes gives
perfectly accurate predictions of the asymptotic behavior of finite populations.
An excellent example is found in a recent paper by Caplin and Nalebuff [2].
They consider a class of voting procedures, parameterized by 0 < δ < 1, in
which the status quo or incumbent can only be defeated or dislodged if more
than δ of the population supports the contesting alternative. Caplin and
Nalebuff first employ the distributional method: they show that if the
distribution function μ is concave, then the smallest δ that guarantees an
equilibrium (undefeatable) point, called the Simpson-Kramer min-max
majority, is 1 − (m/(m+1))^m ([2, Theorem 2]). They continue and prove
([2, Theorem 3]; essentially the same result is apparently found in
[5, 2.4(iii), pp. 151-152, 5.3 p. 164]) that if a finite sample of size n is
drawn at random from the concave distribution μ, then the min-max majority
of the sample converges to the min-max majority of μ almost surely. Hence,
"the bounds of the paper extend to large finite populations drawn from a
concave density" [2, page 801]. Thus
the distributional method is a success in this case.
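
The bound 1 − (m/(m+1))^m is easy to tabulate. The short sketch below (the function name min_max_bound is hypothetical) evaluates it exactly for small dimensions m and compares it with the dimension-free value 1 − 1/e, which is approximately 0.632.

```python
import math
from fractions import Fraction


def min_max_bound(m):
    """Exact value of 1 - (m / (m + 1))**m for dimension m >= 1."""
    return 1 - Fraction(m, m + 1) ** m


if __name__ == "__main__":
    for m in (1, 2, 3, 5, 10, 100):
        print(m, min_max_bound(m) if m <= 3 else "", float(min_max_bound(m)))
    print("1 - 1/e =", 1 - 1 / math.e)
```

For m = 1 the bound is 1/2, for m = 2 it is 5/9, and it increases toward 1 − 1/e as m grows without ever reaching it.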

One  must  take  some  care  in  applying  the  bounds  to  the  finite  case. 
Consider  a  uniform  population  density  on  an  equilateral  triangle  (see  Figure 
4). The mass of μ in the shaded region is 5/9; it follows that the chances
are  close  to  50%  that  more  than  5/9  of  a  random  sample  will  fall  in  the 
shaded  region.  But  if  this  occurs,  the  center  will  not  be  a  5/9-majority  core 
point,  however  slightly  the  sample  fraction  exceeds  5/9. 
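
The "close to 50%" figure can be checked with an exact binomial computation. In the sketch below (hypothetical function name), each of n sample points is assumed to fall in the shaded region independently with probability a/b = 5/9, and the probability that the sample fraction strictly exceeds 5/9 is computed with integer arithmetic.

```python
from fractions import Fraction
from math import comb


def prob_share_exceeds(n, a, b):
    """Exact P(X / n > a / b) for X ~ Binomial(n, a / b), with 0 < a < b."""
    # X / n > a / b is equivalent to b * X > a * n; summing integer weights
    # comb(n, k) * a**k * (b - a)**(n - k) avoids any rounding error.
    numerator = sum(
        comb(n, k) * a ** k * (b - a) ** (n - k)
        for k in range(n + 1)
        if b * k > a * n
    )
    return Fraction(numerator, b ** n)


if __name__ == "__main__":
    for n in (9, 90, 900):
        print(n, float(prob_share_exceeds(n, 5, 9)))
```

The probability stays below 1/2 when 5n/9 is an integer, because the sample fraction can then equal 5/9 exactly, and it moves toward 1/2 as n grows.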

In  fact  a  stronger  statement  is  true:  the  triangle  center  is  a  5/9-majority 
rule  point  with  (asymptotic)  probability  no  more  than  1/8. 

Theorem 4. Let n ideal points be generated independently from the
uniform distribution on a regular triangle. Let p_n denote the probability that
the triangle center is a 5/9 majority point. Then lim sup_n p_n ≤ 1/8.
Proof: see appendix.

Theorem 4 does not negate Theorem 2 of [2] in a substantial way. To
begin with, there is the possibility that some other point very close to the
triangle center is undefeated. But more importantly, suppose that for any
ε > 0, a (5/9 + ε)-majority rule were employed. Then, by the almost sure
convergence of Theorem 3 of [2], the probability converges to 1 that the
triangle center is a majority point.

One of the beautiful things about Theorem 2 of [2] is the dimension-free
corollary that $(1 - 1/e)$-majority rule will have a core (which leads to the
title of the paper). Since for any fixed dimension m there exists an $\epsilon > 0$
such that $1 - (m/(m+1))^m + \epsilon < 1 - 1/e$, the analog of Theorem 4 is
false for the dimension-free corollary. That is, an immediate and very nice
consequence of Theorem 3 of [2] and its corollary is the following:

Corollary: Let n points be sampled independently from any concave
distribution on $\mathbb{R}^m$. Then the probability converges to 1, as $n \to \infty$, that the
centroid of the distribution is a $(1 - 1/e)$-majority rule point.
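The slack behind this corollary is easy to tabulate. The short sketch below (function name illustrative) evaluates the bound $1 - (m/(m+1))^m$ of [2, Theorem 2] for several dimensions and compares it with the dimension-free value $1 - 1/e$; the difference is the room available for the $\epsilon$ in the argument above.

```python
import math

def concave_bound(m):
    """The bound 1 - (m/(m+1))^m on the min-max majority in m dimensions."""
    return 1.0 - (m / (m + 1)) ** m

LIMIT = 1.0 - 1.0 / math.e  # dimension-free value, about 0.632

if __name__ == "__main__":
    for m in (1, 2, 3, 10, 100):
        bound = concave_bound(m)
        print(m, round(bound, 4), "slack:", round(LIMIT - bound, 4))
```

For m = 2 the bound is 5/9, the value used in the triangle example, and the slack shrinks toward 0 as m grows.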

How rapid is the convergence? In the case of a sign-invariant distribution
in two dimensions, Proposition 1 below states that we can certainly expect
an error of order $1/\sqrt{n}$.

Proposition 1. Under the conditions of Theorem 1, the largest majority
that can be mustered against the origin has expected value $\ge \tfrac{1}{2} + \tfrac{1}{2\sqrt{n}}$.
Proof: From the proof of Theorem 1, the gap function executes a random
walk around 0. The expected absolute distance from 0 at the end of a
random walk is $\sqrt{n}/2$ [6]. Dividing by the population size n gives the
result.
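The order of magnitude in this proof can be illustrated with an exact computation. The sketch below assumes, purely for illustration, that the walk is a simple symmetric walk with n steps of size 1 (the text only says "a random walk"); it computes the exact expected absolute endpoint and prints it next to the figure $\sqrt{n}/2$ quoted from [6].

```python
import math

def expected_abs_displacement(n):
    """Exact E|S_n| for a simple symmetric +/-1 random walk of n steps (assumed model)."""
    total = sum(abs(n - 2 * k) * math.comb(n, k) for k in range(n + 1))
    return total / 2 ** n

if __name__ == "__main__":
    for n in (100, 250, 10000):
        walk = expected_abs_displacement(n)
        # columns: n, exact E|S_n|, sqrt(n)/2, E|S_n| as a fraction of the population
        print(n, round(walk, 2), round(math.sqrt(n) / 2, 2), round(walk / n, 4))
```

Under this assumed model the exact value is close to $\sqrt{2n/\pi}$, which is larger than $\sqrt{n}/2$, so either way the gap as a fraction of the population is of order $1/\sqrt{n}$.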

I have not been able to determine rigorous lower bounds in general. If
the region is triangular instead of circular, the random walk is not
stationary (in fact it is no longer Markovian), but heuristically we can again
expect the maximum gap to be on the order of $\sqrt{n}$ in expected value from
the largest distributional gap. The convergence theorems cited in the next
section will tell us that the error levels can be expected not to exceed
$O(1/\sqrt{n})$.

With a committee size of 100 (e.g. U.S. Senate), $1/\sqrt{n}$ is a fairly
substantial 10%. If we seek an explanation for the stability of 2/3-majority rule
in a group of this size, therefore, concavity is not quite enough. Concavity
together with a limitation to 2 issues (m = 2 dimensions) might suffice.
Alternatively, the extreme cases of triangular or simplicial distributions may
in reality be quite rare.

If the population size is 10,000 or more, drawn from a concave density,
the probability of stability under majority rule appears to be fairly good.
From Proposition 1 we heuristically may expect that the maximum gap
will usually not exceed several multiples of the expected value $\sqrt{n}/2$, say
$6(\sqrt{n}/2) = 3\sqrt{n}$. At n = 10,000 this gap as a fraction of population is
$3\sqrt{10000}/10000 \approx 3\%$. But however high the policy space dimension m,
there is always a "cushion" of about 3% between 2/3 and $1 - 1/e$. On the
other hand, when the population size is n = 250 or less, equilibrium may be
unlikely. This is because Proposition 1 suggests that a gap of at least $\sqrt{n}/2$
will occur quite often. We then have $\sqrt{250}/(2 \cdot 250) \approx 3\%$, so the cushion is
not big enough unless additional restrictions are placed on the preferences
of the voter population.
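The arithmetic in this paragraph can be reproduced in a few lines. The sketch below uses only the quantities named in the text: the expected gap $\sqrt{n}/2$, a multiple of it, and the cushion between 2/3 and $1 - 1/e$. The helper name is illustrative.

```python
import math

def gap_fraction(n, multiple=1.0):
    """A multiple of the expected gap sqrt(n)/2, expressed as a fraction of the population n."""
    return multiple * (math.sqrt(n) / 2) / n

CUSHION = 2 / 3 - (1 - 1 / math.e)  # distance between 2/3 and 1 - 1/e

if __name__ == "__main__":
    print("cushion:", round(CUSHION, 4))
    print("n = 10000, six expected gaps:", round(gap_fraction(10000, 6), 4))
    print("n = 250, one expected gap:", round(gap_fraction(250), 4))
    print("n = 100, 1/sqrt(n):", round(1 / math.sqrt(100), 4))
```

At n = 250 a single expected gap is already about as large as the cushion, whereas at n = 10,000 even six expected gaps still fit inside it.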

Thus  the  min-max  majority  results  of  [2],  particularly  the  dimension- 
free  bounds,  provide  a  successful  application  of  distributional  analysis  to 
large  finite  populations,  though  some  care  must  be  taken  in  applying  the 
results  to  smaller  committee  sizes. 

6      General  clues 

Why  do  the  distributional  results  discussed  in  section  5  apply  to  large  finite 
populations,  while  those  discussed  previously  do  not?  Part  of  the  answer 
has  to  do  with  the  difference  between  non-dominance  and  strict  dominance. 
Recall  from  section  4  that  the  finite  sample  meaning  of  the  non-dominance 
relation  R  does  not  converge  to  the  meaning  in  the  distributional  case. 
In contrast, the strict dominance relation P, defined by yPz iff yRz and not zRy,
does converge. That is, if yPz in the distributional sense, and a random
sample of n points is taken, then yPz with respect to that finite sample
with probability converging to 1 as $n \to \infty$. (This follows immediately from
the weak law of large numbers and Davis et al.'s observation that "yPz if
and only if $\Pr(\|y - X\| < \|z - X\|) > 1/2$.")
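The convergence of strict dominance can be seen in a small simulation. The sketch below assumes a uniform distribution on the unit square centered at y = (0, 0) and an arbitrary competitor z = (0.3, 0), for which the distributional probability of preferring y is 0.65. The finite-sample test "more than half the sample is strictly closer to y" is taken as the sample analogue of the criterion quoted above. All names are illustrative.

```python
import random

def fraction_preferring(sample, y, z):
    """Fraction of ideal points strictly closer (in Euclidean distance) to y than to z."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return sum(1 for x in sample if dist2(x, y) < dist2(x, z)) / len(sample)

def strictly_dominates(sample, y, z):
    """Sample analogue of yPz: a strict majority of the sample is closer to y."""
    return fraction_preferring(sample, y, z) > 0.5

def uniform_square(n, rng):
    """n points drawn uniformly from the unit square centered at the origin."""
    return [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    y, z = (0.0, 0.0), (0.3, 0.0)
    for n in (10, 100, 1000, 10000):
        sample = uniform_square(n, rng)
        print(n, round(fraction_preferring(sample, y, z), 3), strictly_dominates(sample, y, z))
```

Because the distributional margin (0.65 against 0.5) is fixed, sampling noise of order $1/\sqrt{n}$ eventually cannot reverse the comparison for this one pair.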

This difference is not enough. For example, suppose distribution $\mu$
is uniform in a square centered at y. Then for all $z \ne y$, yPz in the
distributional sense. But if a finite sample of size 2n is taken, then by
Theorem 1 with probability converging to 1 there will exist $z \ne y$ such that
zPy. However, suppose y strictly dominated all z in some compact set
Z. We might then argue, if $\mu$ were continuous, that the strict domination
occurred with a minimum gap of some $\delta > 0$. If we could then find a way
to reduce consideration of Z to a finite, relatively small (e.g. polynomial in
n) number of points, we could establish the desired behavior of the finite
sample. These ideas are found in the proof of Theorem 3 in [2], where
Lemma 1 (page 807) provides the reduction to a finite number (n + 1)
of points. Similar ideas are found in [26], where the fundamental basis
extreme point theorem of linear programming provides the reduction to a
finite number.

The preceding suggests that the mathematical tools for the convergence
of empirical measures may be appropriate to these questions.¹ This turns
out to be the case. The interested reader should consult chapter 2, "Uniform
Convergence of Empirical Measures," of Pollard's excellent book [18].
A couple of the most pertinent results are cited below (specialized to our
case and adapted to our terminology):

Definition. Let n points be drawn at random according to a probability
measure $\mu$ on $\mathbb{R}^m$. The empirical measure $\mu_n$ is that which places mass $1/n$
at each of the n points (obviously they need not be distinct).

Let $\mathcal{C}$ denote a class of sets in $\mathbb{R}^m$. For any $c \in \mathcal{C}$, it follows that $\mu_n(c)$
simply equals the fraction of the points which fell in c. The class of most
interest to us is the set of all closed and open halfspaces. Accordingly, let

$$\mathcal{C} = \{c : c = \{x : p \cdot x \le p^0\};\ p \in \mathbb{R}^m,\ p^0 \in \mathbb{R}\}. \qquad (1)$$

Also let $\mathcal{C}^+$ denote the set of open halfspaces, and let $\mathcal{D} = \mathcal{C} \cup \mathcal{C}^+$. The
uniform convergence theorem of [18] implies that the empirical measure
converges to $\mu$ over these classes.

Theorem 5. Let $\mu$ be a probability measure on $\mathbb{R}^m$. Then

$$\sup_{d \in \mathcal{D}} |\mu_n(d) - \mu(d)| \to 0 \quad \text{almost surely.} \qquad (2)$$

Proof: this follows from Theorem 14 (page 18), Lemma 15(i,ii) (page
18), and Lemma 18 (pages 20-21) of [18].

¹ I am indebted to Bob Foley, Richard McKelvey, and Gideon Weiss for suggesting this
line of attack.

This means that even if we consider all half-spaces h, the largest gap
between the fraction of points falling in the half-space and the expected
fraction ($\mu(h)$) converges to 0.
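In one dimension the statement of Theorem 5 can be watched directly. The sketch below assumes $\mu$ is uniform on [0, 1]; halfspaces are then half-lines, and the supremum in (2) over closed and open half-lines reduces to the familiar Kolmogorov-Smirnov statistic. This is only a one-dimensional illustration with illustrative names, not the general halfspace computation.

```python
import random

def sup_halfline_discrepancy(sample):
    """sup over half-lines of |mu_n - mu| when mu is Uniform[0, 1] and the sample lies in [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    # At the i-th order statistic the empirical fraction jumps from i/n to (i+1)/n.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

if __name__ == "__main__":
    rng = random.Random(1)
    for n in (10, 100, 1000, 10000):
        sample = [rng.random() for _ in range(n)]
        print(n, round(sup_halfline_discrepancy(sample), 4))
```

The printed discrepancies should shrink roughly like $1/\sqrt{n}$, in line with the error levels discussed in section 5.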

To demonstrate the usefulness of Theorem 5, we invoke it to prove the
convergence of the min-max majority. The first part of Theorem 6
generalizes Theorem 3 of [2] from bounded continuous to arbitrary distributions;
the second part of Theorem 6 is very similar to 2.4(iii) and Proposition
10 in [5]. Yet the proof of Theorem 6 is much shorter and simpler. This
confirms the appropriateness of this line of attack (and the wisdom of my
colleagues).

Theorem 6. Let $\mu$ be a probability measure on $\mathbb{R}^m$. Let n points be
randomly and independently sampled from $\mu$. Then the min-max majority value of
the sample, $\alpha(\mu_n)$, converges to the distributional min-max majority $\alpha(\mu)$
almost surely. If in addition $\mu$ is continuous and possesses a unique min-max
winner point z, then the min-max winner of the sample converges a.s. to
z.

Proof: If z is an $\alpha$-majority point with respect to $\mu$ then by Theorem 5 it
will be an $(\alpha + \epsilon)$-majority point for $\mu_n$ eventually, for any positive $\epsilon$. Thus
$\limsup_n \alpha(\mu_n) \le \alpha(\mu)$. Conversely, for any $\beta < \alpha(\mu)$, set $\delta = \alpha(\mu) - \beta$. For
all $x \in \mathbb{R}^m$, there exists a hyperplane $h_x$ through x such that a halfspace $h_x^+$
defined by $h_x$ has mass $\mu(h_x^+) \ge \beta + \delta$. Again by Theorem 5, the supremum
of the fractional discrepancies over all these halfspaces converges to 0 a.s.
Thus,

$$\inf_x \mu_n(h_x^+) > \beta + \delta/2 \qquad (3)$$

eventually, with probability 1 (a fraction of at least $\beta + \delta/2$ can be mustered
against every point). Hence $\liminf_n \alpha(\mu_n) \ge \alpha(\mu)$. This proves the first
part of Theorem 6.

The proof of the first part has moreover established that z has limiting
minimal winning supermajority fraction $\alpha$. It remains to show that no
points other than z can also be winning with fraction $\alpha$. Accordingly let
$\epsilon > 0$ be arbitrary. Let $S \subset \mathbb{R}^m$ be an enormous ball containing z and with
$\mu(S) > \alpha$, so that eventually with probability 1 no point outside S can
be an $\alpha$-majority winner. Let T denote S with the small ball of radius $\epsilon$
around z removed, $T = S \setminus B(z, \epsilon)$. By the compactness of T and continuity
of $\mu$, there exists $\beta$ such that the min-max majority over all $x \in T$ equals
$\beta$. By the uniqueness of z, $\beta > \alpha$. Then by the same argument as led to
inequality (3), with $\delta = \beta - \alpha$, eventually with probability 1 we have:

$$\inf_{x \in T} \mu_n(h_x^+) > \beta - \delta/2 > \alpha.$$

Hence eventually no point in T will be an $\alpha$-majority winner. This
completes the proof.

Theorem 6 ensures that convergence of $\alpha(\mu_n)$ holds for any distribution.
This  is  of  particular  importance  for  empirical  applications,  because  spatial 
voting  data  is  often  discrete.  For  example,  the  Senate  data  in  [10]  and  other 
studies  [24]  are  taken  from  roll  call  votes.  Similarly,  most  public  opinion 
polls  ask  yes/no  questions  or  limit  answers  to  integers  in  a  small  range 
(e.g.  1-5).  In  all  these  cases  the  real  data  will  be  discrete.  Even  if  kernel 
smoothing  ([18,  pp.  35,42])  were  employed  the  resulting  distributions  might 
not  be  continuous.  Also  notice  the  following:  if  two  groups  of  samples  were 
taken from $\mu$, Theorem 5 would ensure the convergence of the two empirical
measures  to  each  other.  This  matches  the  scenario  described  in  section  1, 
where  information  from  polls  or  past  voting  records  is  used  to  predict  an 
outcome. 

In general, we consider a function(al) f whose domain is the set of
probability measures and whose range is the reals. For example, f might
be an indicator function for the event "0 is undominated", or $f_i$ might be
the ith coordinate of the center of mass of the distribution. When f is
continuous, the uniform convergence of the empirical measure will ensure
the convergence of $f(\mu_n)$ to $f(\mu)$.

Consider the indicator function just defined. It is not continuous, in
the following sense: there exists $\epsilon > 0$ such that for all $\Delta > 0$, there exist
empirical distributions $\mu_n$ and $\tilde{\mu}_n$ satisfying

$$\sup_{d} |\mu_n(d) - \tilde{\mu}_n(d)| < \Delta$$

but $|f(\mu_n) - f(\tilde{\mu}_n)| > \epsilon$. (Just take $\epsilon = .9$.) Moreover the discontinuity
occurs just at the distributions of interest, where the fraction on one side of
a hyperplane is 1/2. From a more general point of view, this explains the
failure of finite behavior to converge to distributional behavior as discussed
in sections 3 and 4.

The mathematical guideline for convergence is the continuity of the
functional. Let us attempt to formulate a less technical rule of thumb to
give a general sense of how to make accurate predictions for finite
populations based on distributional results: if the event or quantity of interest
depends on the precise way voters are split among regions, then a
convergence problem is apt to arise; if it relies instead on having a certain fraction
or more in a region, then the result is apt to apply to the large finite case,
possibly with the fraction perturbed slightly.
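The discontinuity of the indicator functional is easiest to see in one dimension. The sketch below assumes a one-dimensional policy space, where "0 is undominated" under simple majority rule amounts to 0 being a median of the sample. It builds two samples of size n that differ in a single voter: the half-line discrepancy between them is 1/n, yet the indicator jumps from 1 to 0. Function names are illustrative.

```python
def zero_is_undominated(sample):
    """1 if 0 is a median of the sample (both closed half-lines hold at least half), else 0."""
    n = len(sample)
    left = sum(1 for x in sample if x <= 0)
    right = sum(1 for x in sample if x >= 0)
    return 1 if (2 * left >= n and 2 * right >= n) else 0

def sup_halfline_distance(a, b):
    """sup over closed and open half-lines of the gap between two empirical measures."""
    best = 0.0
    for t in set(a) | set(b):
        closed = abs(sum(x <= t for x in a) / len(a) - sum(x <= t for x in b) / len(b))
        opened = abs(sum(x < t for x in a) / len(a) - sum(x < t for x in b) / len(b))
        best = max(best, closed, opened)
    return best

if __name__ == "__main__":
    n = 100
    balanced = [-1.0] * (n // 2) + [1.0] * (n // 2)         # exact 50:50 split around 0
    tilted = [-1.0] * (n // 2 - 1) + [1.0] * (n // 2 + 1)   # one voter moved across 0
    print("distance:", sup_halfline_distance(balanced, tilted))
    print("indicator:", zero_is_undominated(balanced), zero_is_undominated(tilted))
```

Increasing n makes the two samples arbitrarily close while the indicator values stay 1 and 0, which is the sense of discontinuity described above.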

Let us apply these observations to the yolk radius convergence shown in
[26]. A hyperplane is median if the two closed halfspaces it defines each
contain at least half the population. The yolk is the smallest ball intersecting
all median hyperplanes [7,14]. If there is a simple majority rule core point,
the yolk is that point. Under what circumstances can we expect the yolk
radius to be small? From a distributional point of view,² a yolk radius of 0
corresponds to a nonempty core. Necessary and sufficient conditions for a
nonempty core, in the distributional sense, are (see [3,15]) that $\mu$ be weakly
centered: every hyperplane through 0 is a median hyperplane. Therefore a
distributional analysis predicts that weak centeredness would be necessary
and sufficient for the yolk radius of random samples to converge to 0.
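For a finite sample, the definition of a median hyperplane given above translates directly into a counting test. The sketch below represents a hyperplane as the set of x with normal . x = offset; this parametrization and the function name are assumptions made for illustration.

```python
def is_median_hyperplane(points, normal, offset):
    """True if both closed halfspaces of {x : normal . x = offset} hold at least half the points."""
    n = len(points)
    dots = [sum(a * b for a, b in zip(normal, p)) for p in points]
    below = sum(1 for d in dots if d <= offset)
    above = sum(1 for d in dots if d >= offset)
    return 2 * below >= n and 2 * above >= n

if __name__ == "__main__":
    voters = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
    print(is_median_hyperplane(voters, (1.0, 0.0), 0.0))  # the line x = 0
    print(is_median_hyperplane(voters, (1.0, 0.0), 0.5))  # the line x = 0.5
```

Points lying on the hyperplane count toward both closed halfspaces, which is why the test depends on exactly how the voters are split.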

Our rule of thumb suggests that there may be a problem with the exact
50:50 split of the weak centeredness condition, but that a $(50 + \epsilon) : (50 - \epsilon)$
splitting condition would be apt to work. It turns out that the true
necessary and sufficient condition is that $\mu$ be strictly centered [26]: for every
hyperplane not passing through 0, the halfspace it defines not containing
the origin must contain strictly less than half the population. This outcome
seems well in accord with the guidelines proposed above.

We can invoke Theorem 5 to prove the sufficiency half of this result,³
though under an additional assumption of continuity of the distribution $\mu$.
Despite the lessened generality of Theorem 7, the ease and brevity of its
proof are noteworthy.

Theorem 7. Let n points be sampled independently
from $\mu$, a strictly centered continuous distribution on $\mathbb{R}^m$. Then the radius
of the yolk of the sample converges to 0 a.s. as $n \to \infty$.
Proof: see Appendix.


² This distributional analysis is due to Richard McKelvey.

³ The essentials of this proof were suggested to me independently by Robert Foley,
Richard McKelvey, Loren Platzman, and Gideon Weiss.

7      Rereading Tullock's paper on the general irrelevance...

The results in this paper might seem to invalidate claims in Tullock's
original work. A careful reading shows this is not so. Tullock's original paper,
"The general irrelevance of the general impossibility theorem" [28], is in my
opinion an altogether brilliant piece of work, combining important
empirical evidence (the scarcity of actual cycling or chaos) with abundant creative
inspiration and exceptional mathematical intuition (as well as dramatic
exposition). A careful reading reveals that Tullock is actually discussing finite
configurations, and only appeals to the infinite configurations as an
intuitive aid. For example, after describing a uniform distributional model,
Tullock writes ([28, page 259]):

This might be called the perfect geometrical model, in which
the number of voters whose optima fall in a given area is
exactly proportional to its area. Given that the voters are finite
in number, small discontinuities would appear. Two areas that
differ little in size might have the same number of voters;
indeed, the smaller might even have more. Cycles are, therefore,
possible, but they would become less and less important as the
number of choosing individuals increases.

Later, Tullock specifically remarks that the probability of cycling should
increase as the population grows [ibid., page 261]:

For  close  to  the  center,  the  area  which  is  preferred  to  A 
would  be  farther  from  the  center  than  A.  Cycling  becomes 
more  probable.  When  we  get  very  close  to  the  center  a  point 
randomly  selected  from  among  those  which  could  get  a  majority 
over  the  given  point  would  have  a  good  chance  of  being  farther 
from  the  center  than  it  is.  At  this  point,  however,  most  voters 
will  feel  that  new  proposals  are  splitting  hairs,  and  the  motion 
to adjourn will carry.

This intuitive statement is in accord with Theorem 1. Thus Tullock is not
claiming that cycles won't usually exist in large populations.⁴ Tullock's
main point is that they won't matter.

One of the arguments Tullock advances to support his point is that
unless proposals were carefully manipulated, "the voting process would in
all probability lead to rapid movement toward the center" [28, page 261].
This argument is actually a loose forerunner of the yolk, the smallest ball
intersecting all median hyperplanes. (Tullock's discussion of intersections
of median lines, pages 261-262, is especially evocative of the yolk.)

Since  that  time  the  yolk  has  been  rigorously  established  by  Ferejohn, 
McKelvey,  and  Packel  [7]  and  McKelvey  [14].  More  recently  it  has  been 
proved  that  the  radius  of  the  yolk  does  converge  to  0  a.s.  for  the  distribution 
of  Tullock's  example  (or  any  other  centered  distribution)  [26].  Considering 
the  length  of  time  by  which  Tullock's  work  preceded  the  mathematical 
development  of  the  appropriate  technical  tools,  Tullock's  insights  seem  all 
the more remarkable.

9      Appendix: Proof of Theorems 4 and 7

Proof of Theorem 4: The three lines through the triangle center in Figure 5
divide the triangle into the six regions labelled a, b, c, d, e, f. For notational
ease, let the region label also represent the number of sample points falling
in that region. If the center is to be a 5/9-majority point, then $b + c + d \le 5n/9$,
and similarly $d + e + f \le 5n/9$; $f + a + b \le 5n/9$. These imply our key
inequalities: $b - e \le n/9$; $c - f \le n/9$; $d - a \le n/9$. That is,
the number of points in each rhombus is no more than n/9 more than
the number of points in the opposing triangle. Applying the strong law of
large numbers, the actual number in each region, for large n, will be within
$O(\sqrt{n})$ of its expected value with very high probability (geometrically
decreasing chance of failure). We may therefore condition on the partitioning
among the three rhombus-triangle pairs being close to the expected value
of n/3 in each of these three paired regions, and the error in our
resulting estimate converges to 0. Once we condition on this likely event, the
three key inequalities become independent. Now approximating the
binomial distribution of parameters ~ n/3, 1/3 with a normal distribution (by
the strong law of large numbers), and since $n^{1/2}$ dominates $n^{1/4}$, it follows
that the probability is asymptotically 1/2 that the gap between rhombus
and opposing triangle satisfies each of the three inequalities. (In other words
the median and mean of the binomial are very close.) Therefore, the conditional
probability that the three key inequalities all hold is asymptotically 1/8. Thus
$p_n$ in the limit is bounded by 1/8. This proves Theorem 4.

⁴ He also argues that "it is possible, by simple majority voting, to reach points at almost
any portion of the issue space", an adumbration of the classic chaos theorems of McKelvey
and Schofield [12,13,21,22].

The upper bound of 1/8 in Theorem 4 can be extended easily to $1/2^{m+1}$
for m dimensions.

I would moreover conjecture that $p_n \to 0$ as $n \to \infty$.

Theorem 7.⁵ Let n points be sampled independently from $\mu$, a strictly
centered continuous distribution on $\mathbb{R}^m$. Then the radius of the yolk of the
sample converges to 0 a.s. as $n \to \infty$.

Proof:  Following  the  proof  in  [26],  we  show  that  the  largest  distance 
from  0  to  any  median  hyperplane  converges  to  0.  Since  this  distance  is  an 
upper  bound  on  the  yolk  radius,  the  result  will  follow. 

For any $x \ne 0$, let $h_x^+$ denote the halfspace not containing the origin
defined by the hyperplane normal at x. By strict centeredness $\mu(h_x^+) < 1/2$.
By continuity $\mu(h_x^+)$ is continuous in x.

Let $\epsilon > 0$ be arbitrary. Clearly the largest vote attained against 0
by points $\epsilon$ or more away from 0 is attained by points $\epsilon$ away, or more
accurately

$$\sup_{\|x\| \ge \epsilon} \mu(h_x^+) = \sup_{\|x\| = \epsilon} \mu(h_x^+).$$

By compactness of the set the latter supremum is taken over, and continuity,
the supremum is attained. Thus there exists $\beta < 1/2$ such that for all
$\|x\| \ge \epsilon$, we have $\mu(h_x^+) \le \beta$.

⁵ See the acknowledgment footnote, page 18.

The halfspaces $h_x^+$ are contained in the class $\mathcal{C}$. Let the n points be
sampled from $\mu$. Apply Theorem 5 to find that with probability 1, as n
increases,

$$\mu_n(h_x^+) < \frac{\beta + 1/2}{2} < 1/2 \quad \forall\, \|x\| \ge \epsilon.$$

This implies that there is no median hyperplane at distance $\epsilon$ or more
from 0, whence the result follows.